skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Yuan, Lyuheng"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available September 17, 2026
  2. Free, publicly-accessible full text available June 1, 2026
  3. Accurate and timely mapping of flood extent from high-resolution satellite imagery plays a crucial role in disaster management such as damage assessment and relief activities. However, current state-of-the-art solutions are based on U-Net, which cannot segment the flood pixels accurately due to the ambiguous pixels (e.g., tree canopies, clouds) that prevent a direct judgement from only the spectral features. Thanks to the digital elevation model (DEM) data readily available from sources such as United States Geological Survey (USGS), this work explores the use of an elevation map to improve flood extent mapping. We propose, EvaNet, an elevation-guided segmentation model based on the encoder-decoder architecture with two novel techniques: (1) a loss function encoding the physical law of gravity that if a location is flooded (resp. dry), then its adjacent locations with a lower (resp. higher) elevation must also be flooded (resp. dry); (2) a new (de)convolution operation that integrates the elevation map by a location-sensitive gating mechanism to regulate how much spectral features flow through adjacent layers. Extensive experiments show that EvaNet significantly outperforms the U-Net baselines, and works as a perfect drop-in replacement for U-Net in existing solutions to flood extent mapping. EvaNet is open-sourced at https://github.com/MTSami/EvaNet 
    more » « less
  4. Subgraph search problems such as maximal clique enumeration and subgraph matching generate a search-space tree which is traversed in depth-first manner by serial backtracking algorithms that are recursive. Since Jenkins et al. reported the backtracking paradigm to be sub-optimal for GPU acceleration, breadth-first traversal of the search-space tree is widely adopted by GPU algorithms. However, they produce a lot of intermediate subgraphs that exhaust the GPU device memory. Recent works revive the depth-first backtracking paradigm for GPU acceleration, where each warp is a basic processing unit with its own stack in device memory for subgraph backtracking. However, they adopt complicated methods for load balancing that incur a lot of overheads. They also use hardcoded fixed space for stacks that is determined ad-hoc and may lead to inaccuracy when the allocated space is insufficient. In this paper, we use subgraph matching as a case study to propose novel depth-first GPU solutions to address the above problems. Our approach, called T-DFS, decomposes the compu- tation into independent tasks that process search-space subtrees, which are managed by an efficient lock-free circular task queue. Tasks are distributed to different warps for parallel processing, and a novel timeout mechanism is used to eliminate straggler tasks to ensure load balancing. We also support flexible and fine- grained dynamic memory allocation for stack spaces to avoid the stack space allocation pitfalls of existing works. Extensive experi- ments on real graphs show that T-DFS significantly outperforms existing depth-first GPU solutions for the subgraph matching application. 
    more » « less
  5. Graph-theoretic algorithms and graph machine learning models are essential tools for addressing many real-life problems, such as social network analysis and bioinformatics. To support large-scale graph analytics, graph-parallel systems have been actively developed for over one decade, such as Google’s Pregel and Spark’s GraphX, which (i) promote a think-like-a-vertex computing model and target (ii) iterative algorithms and (iii) those problems that output a value for each vertex. However, this model is too restricted for supporting the rich set of heterogeneous operations for graph analytics and machine learning that many real applications demand. In recent years, two new trends emerge in graph-parallel systems research: (1) a novel think-like-a-task computing model that can efficiently support the various computationally expensive problems of subgraph search; and (2) scalable systems for learning graph neural networks. These systems effectively complement the diversity needs of graph-parallel tools that can flexibly work together in a comprehensive graph processing pipeline for real applications, with the capability of capturing structural features. This tutorial will provide an effective categorization of the recent systems in these two directions based on their computing models and adopted techniques, and will review the key design ideas of these systems. 
    more » « less
  6. Finding from a big graph those subgraphs that satisfy certain conditions (aka. subgraph search) is useful in many applications such as community detection and subgraph matching. These problems often generate a search-space tree with size exponential to the size of the input graph. GPUs with thousands of cores are a natural choice to speed up subgraph search, but existing GPU solutions either conduct BFS on the search-space tree which leads to memory overflow due to intermediate subgraph-size explosion, or they conduct DFS on the search-space tree which is memory-efficient but can be 2 orders of magnitude slower than a BFS solution. In this paper, we present G2-AIMD, a subgraph-centric framework for efficient subGraph Search on GPUs, which enjoys the efficiency of BFS on the search-space tree, while avoids intermediate subgraph-size explosion with novel system designs such as adaptive chunk-size adjustment and host-memory subgraph buffering, inspired by the additive-increase/multiplicative-decrease (AIMD) algorithm in TCP congestion control. G2-AIMD provides a convenient subgraph-centric programming interface to facilitate the implementation of subgraph search algorithms on top, so as to enjoy the above performance merits. G2-AIMD also supports multi-GPU execution where each GPU only needs to load a fraction of the input graph. To demonstrate the efficiency and scalability of G2-AIMD, two algorithms were implemented on top with additional optimization techniques, and they significantly outperform the existing GPU solutions. 
    more » « less
  7. In this demonstration paper, we describe FSM-Explorer, an interactive tool for that makes it easier for end-users to mine frequent subgraph patterns from a big graph G, and to explore the subgraph instances in G that match the patterns. FSM-Explorer not only supports the popular MNI support measure, but also the recently proposed Fraction-Score measure that is more accurate. Its backend engine is built on top of the recent T-FSM system that ensures high concurrency, bounded memory consumption, and effective load balancing. Using real-world data, we showcase how users can mine frequent subgraph patterns by parameter tuning in FSM-Explorer, and how they can conveniently examine the many matched instances in G one batch at a time to improve productivity. 
    more » « less
  8. Finding frequent subgraph patterns in a big graph is an important problem with many applications such as classifying chemical compounds and building indexes to speed up graph queries. Since this problem is NP-hard, some recent parallel systems have been developed to accelerate the mining. However, they often have a huge memory cost, very long running time, suboptimal load balancing, and possibly inaccurate results. In this paper, we propose an efficient system called T-FSM for parallel mining of frequent subgraph patterns in a big graph. T-FSM adopts a novel task-based execution engine design to ensure high concurrency, bounded memory consumption, and effective load balancing. It also supports a new anti-monotonic frequentness measure called Fraction-Score, which is more accurate than the widely used MNI measure. Our experiments show that T-FSM is orders of magnitude faster than SOTA systems for frequent subgraph pattern mining. Our system code has been released at https://github.com/lyuheng/T-FSM. 
    more » « less
  9. The k-core of a graph is the largest induced sub-graph with minimum degree k. The problem of k-core decomposition finds the k-cores of a graph for all valid values of k, and it has many applications such as network analysis, computational biology and graph visualization. Currently, there are two types of parallel algorithms for k-core decomposition: (1) degree-based vertex peeling, and (2) iterative h-index refinement. There is, however, few studies on accelerating k-core decomposition using GPU. In this paper, we propose a highly optimized peeling algorithm on a GPU, and compare it with possible implementations on top of think-like-a-vertex graph-parallel GPU systems as well as existing serial and parallel k-core decomposition algorithms on CPUs. Extensive experiments show that our GPU algorithm is the overall winner in both time and space. Our source code is released at https://github.com/akhlaqueak/KCoreGPU. 
    more » « less